The main data used in this tutorial and in the lecture are about the geolocalisation of french restaurants in Paris and in a department called Haute-Garonne. We use two different sources:

  1. an official register called SIRENE (Computer system for the business and establishment register) managed by the French National Institute of Statistics and Economic Studies (Insee) and geolocated by Etalab (French task force for Open Data). This register records the civil status of all companies and their establishments (including restaurants).

SIRENE has the advantages of being rigorous and exhaustive on the French territory.

  1. the famous global open access mapping project OpenStreetMap (OSM). We can access to OSM database using the osmdata R package.

OSM has many benefits, ensuring transparent data provenance and ownership, enabling real-time evolution of the database and, by allowing anyone to contribute, encouraging democratic decision making and citizen science.

Exercise 1 : Manipulate sf objects and associated data.frames

  1. Load in R the iris1 map layer ‘iris_31.rds’ of the french department called Haute-Garonne (numbered 31). Would the sf::st_read function also work? Why?
Use the readRDS function.
It would not work because ‘iris_31’ is not a shapefile but a file already R formatted. Simply load it with the readRDS function.
  1. Display the basemap of department 31 with plot(iris_31). What do you notice ?
We notice that R performs 3 graphs: one graph per variable in the sf object.
  1. What is the functionality of the sf::st_geometry function? What solution do you propose then?
sf::st_geometry makes it possible to isolate the information contained in the ‘geometry’ column of the sf object. Using it, we put aside other variables (here CODE_IRIS, P14_POP and INSEE_COM).
  1. In which projection is the map layer? Map it with another projection.

Information

Test the Azimuthal Equidistant projection with crs="+proj=aeqd +lat_0=90 +lon_0=0 to see a clear difference and create a layer called ‘iris_31_aeqd’.
Use the sf::st_crs function to guess the projection and sf::st_transform to change the projection.

  1. Calculate the distance matrix between the 5 first iris of department 31. Do you get the same distance matrix if you are working on a layer projected in another projection?

Information

Use map layers called ‘iris_31’ and ‘iris_31_aeqd’.
Units: m
         [,1]      [,2]     [,3]      [,4]     [,5]
[1,]     0.00 51153.341 56509.22 53137.481 13756.98
[2,] 51153.34     0.000 20957.21  3086.011 49828.91
[3,] 56509.22 20957.212     0.00 11077.630 61863.18
[4,] 53137.48  3086.011 11077.63     0.000 54345.46
[5,] 13756.98 49828.910 61863.18 54345.458     0.00
Units: m
         [,1]      [,2]     [,3]      [,4]     [,5]
[1,]     0.00 57206.922 62306.32 59385.205 13798.28
[2,] 57206.92     0.000 20957.48  3204.855 55368.23
[3,] 62306.32 20957.481     0.00 11076.220 66744.65
[4,] 59385.20  3204.855 11076.22     0.000 59843.30
[5,] 13798.28 55368.225 66744.65 59843.302     0.00
[1] FALSE
No, the two matrices are different.
  1. Using the layer called ‘iris_31’, create a new aggregated map layer called ‘com_31’ which corresponds to the municipalities of department 31. Also keep in this new layer the information on the population in each municipality.

Information

The map layer called ‘iris_31’ contains the 5 digit codes of municipalities in its variable INSEE_COM and the 2014 population in its column P14_POP.
Use the classic functions of dplyr package: select, group_by et summarize. These functions also work with sf objects.

  1. Using the data contained in ‘sir_31’, add to this layer the number of restaurants per municipality.

Information

The code of each municipality is not in the ‘sir_31’ database. To create it, you have to create a variable called INSEE_COM (5 digits) which concatenates the DEPET (2 digits) and COMET (3 digits) variables.
  1. Aggregate all the information present in ‘com_31’ at the level of french intercommunalites (called EPCI) and call this new layer ‘epci_31’.

Information

You have to use the database ‘table_MAUP.rds’ to have a match between the municipality code (CODGEO) and intercommunality code (EPCI).

Information

We can notice that the number of restaurants is very low or even equals to zero in most municipalities. We will therefore want to create a new map layer that partially uses the zoning of the municipalities and partially the EPCI zoning.
  1. We want to map the number of restaurants at the level of municipalities if its EPCI (intercommunality) contains more than 100 restaurants and at the level of intercommunalities if the intercommunality contains less than 100 restaurants. Create a map layer that meets this need.

Information

Start by creating the layers ‘EPCI_less100’ and ‘COM_more100’ corresponding respectively to the EPCIs which contain less than 100 restaurants and to the municipalities which belong to an EPCI containing more than 100 restaurants.

Use the do.call(rbind, list(EPCI_less100,COM_more100)) statement to merge these two layers.

Municipalities

Intercommunalities

Both

  1. Using the cartography package, simply add to each territory of this map a proportional circle layer related to the number of restaurants.
The propSymbolsLayer function allows you to draw proportional circles.

Exercise 2 : Maps with cartography and ggplot2 packages

  1. Data preparation:

Information

For the creation of ‘bks’ et ‘cols’, use the getBreaks et carto.pal functions of the cartography package. For the creation of the typo variable, you can use the cut function and apply the parameters digit.lab = 2 and include.lowest = TRUE.
  1. With the help of cartography package, make the following map which contains in a choropleth layer the variable nb_rest_10000inhab and in a proportional circle layer the variable nb_of_rest. Do the same thing with the ggplot2 package.

With cartography:

With ggplot2:

cartography

ggplot2

Exercise 3 : Smooth Density analysis of restaurants in Haute-Garonne

  1. Load the dataset ‘sir_31’ used previously and map the more than 4,000 restaurants of department 31 with the mapview package. Try using different parameters to customize your map.

Information

For example, you can use the map.types, col.regions, label, color, legend, layer.name, homebutton, lwd … parameters of the mapview function.
  1. Load the layer ‘iris_31’ used previously. Apply the pt_in_grid function below on the SIRENE data and on the iris of the department 31 to create a grid called ‘grid’ and count points in it. Use a 2km cellsize.
  1. Simply use the plot instruction to plot the grid. Then, using the choroLayer function of the cartography package, plot the number of restaurants per kilometer (called ‘n’) in each box of the grid. What observation can you make concerning the readability of the map?

The map is not very readable because of the choice of the discretization.
  1. Plot the distribution of the number of restaurants per box in the grid. What do you notice and what solution can you suggest?
The distribution is highly dissymetric. We could use instead a geometric progression classification.
  1. Try to make a more readable map.
Test the method = 'geom' and nclass = 9 parameters of the cartography::getBreaks function

  1. In order to make the result even more meaningful, we will now make a smoothed map. You can use the compute_kde function below.

Information

Use here a 2km sigma and a 2km resolution and the quantile method (n = 12).

  1. Try to improve the map resolution.

  1. Would we have obtained the same smoothing for this department if we had kept the restaurants of the bordering departments?
No. The borders would have been different.

Exercise 4 : Cartogram

Make a cartogram of department 31 at the level of intercommunalities in proportion to the number of restaurants (SIRENE data).

Information

Load the layer ‘epci_31.rds’ from previous exercises and the cartogram package.

Normal borders

Cartogram

Exercise 5 : Linemap

Make a linemap of department 31 at the level of municipalities in proportion to the number of restaurants (SIRENE data).

Information

Load the map layer ‘com_31.rds’ from previous exercises and the linemap package.
Use the two getgrid andlinemap functions of this package. The following parameters work: cellsize = 1750, k = 400 and threshold = 0.01.



reproducibility

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252   
[3] LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
[5] LC_TIME=French_France.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] linemap_0.1.0       cartogram_0.1.0     raster_2.6-7        maptools_0.9-2     
 [5] sp_1.2-5            spatstat_1.56-1     rpart_4.1-10        nlme_3.1-128       
 [9] spatstat.data_1.3-1 mapview_2.3.0       ggplot2_3.0.0       cartography_2.1.1  
[13] bindrcpp_0.2        dplyr_0.7.4         sf_0.6-3            knitr_1.20         
[17] stringr_1.3.1      

loaded via a namespace (and not attached):
 [1] viridis_0.4.0        jsonlite_1.5         viridisLite_0.3.0    foreach_1.4.3       
 [5] R.utils_2.5.0        shiny_1.0.5          assertthat_0.2.0     stats4_3.3.2        
 [9] yaml_2.1.18          pillar_1.1.0         lattice_0.20-34      glue_1.2.0          
[13] digest_0.6.15        RColorBrewer_1.1-2   polyclip_1.9-1       colorspace_1.3-2    
[17] htmltools_0.3.6      httpuv_1.3.5         Matrix_1.2-7.1       R.oo_1.21.0         
[21] plyr_1.8.4           pkgconfig_2.0.1      xtable_1.8-2         scales_0.5.0.9000   
[25] webshot_0.5.0        tensor_1.5           satellite_1.0.1      spatstat.utils_1.9-0
[29] tibble_1.4.2         mgcv_1.8-15          withr_2.1.2          lazyeval_0.2.1      
[33] gdalUtils_2.0.1.7    magrittr_1.5         mime_0.5             deldir_0.1-14       
[37] evaluate_0.10.1      R.methodsS3_1.7.1    foreign_0.8-67       class_7.3-14        
[41] tools_3.3.2          munsell_0.4.3        e1071_1.6-8          rlang_0.2.1         
[45] classInt_0.1-24      units_0.6-0          grid_3.3.2           unilur_0.4.0.9000   
[49] iterators_1.0.8      htmlwidgets_1.2      goftest_1.1-1        crosstalk_1.0.0     
[53] base64enc_0.1-3      rmarkdown_1.10.12    gtable_0.2.0         codetools_0.2-15    
[57] abind_1.4-5          DBI_1.0.0            R6_2.2.2             gridExtra_2.3       
[61] rgdal_1.2-11         rgeos_0.3-24         bindr_0.1            stringi_1.1.7       
[65] htmldeps_0.1.1       Rcpp_0.12.16         png_0.1-7            leaflet_1.1.0       

  1. In French, IRIS is an acronym of ‘aggregated units for statistical information’. Their target sizes are 2000 residents per basic unit.